Room Segmentation


Traversability-aware Consistent Situational Graphs for Indoor Localization and Mapping

Kim, Jeewon, Oh, Minho, Myung, Hyun

arXiv.org Artificial Intelligence

Scene graphs enhance 3D mapping capabilities in robotics by capturing the relationships between spatial elements, such as rooms and objects. Recent research extends scene graphs into hierarchical layers, adding and leveraging constraints across these levels. This approach is tightly integrated with pose-graph optimization, improving localization and mapping accuracy simultaneously. However, when segmenting spatial characteristics, consistently recognizing rooms is challenging due to variations in viewpoint and the limited field of view (FOV) of sensors. For example, existing real-time approaches, because they segment space in a time-dependent manner, often over-segment large rooms into smaller, non-functional spaces that are not useful for localization and mapping. Conversely, their voxel-based room segmentation often under-segments complex cases, such as 3D spaces that are not fully enclosed and are non-traversable for ground robots or humans, leading to false constraints in pose-graph optimization. We propose a traversability-aware room segmentation method that considers the interaction between the robot and its surroundings and produces consistent segments from traversability information. This enhances both the semantic coherence and the computational efficiency of pose-graph optimization. The improved performance is demonstrated through the re-detection frequency of the same rooms in a dataset involving repeated traversals of the same space along the same path, as well as through the optimization time consumed.
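
As a rough illustration of the traversability idea (not the authors' implementation), the sketch below groups free grid cells into room candidates only where they remain reachable after eroding by the robot footprint; the grid convention (0 = free) and the footprint radius are assumptions for illustration.

import numpy as np
from scipy import ndimage

def segment_rooms(occupancy, robot_radius_cells=2):
    free = occupancy == 0
    # Erode free space by a disk of the robot's radius: what survives is
    # traversable for a ground robot; narrow gaps and clutter drop out.
    r = robot_radius_cells
    yy, xx = np.mgrid[-r:r + 1, -r:r + 1]
    footprint = (yy ** 2 + xx ** 2) <= r ** 2
    traversable = ndimage.binary_erosion(free, structure=footprint)
    # Each connected traversable region becomes one room candidate, so a large
    # hall observed from several viewpoints still collapses into a single label.
    labels, num_rooms = ndimage.label(traversable)
    return labels, num_rooms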


TACS-Graphs: Traversability-Aware Consistent Scene Graphs for Ground Robot Localization and Mapping

Kim, Jeewon, Oh, Minho, Myung, Hyun

arXiv.org Artificial Intelligence

Scene graphs have emerged as a powerful tool for robots, providing a structured representation of spatial and semantic relationships for advanced task planning. Despite their potential, conventional 3D indoor scene graphs face critical limitations, particularly under- and over-segmentation of room layers in structurally complex environments. Under-segmentation misclassifies non-traversable areas as part of a room, often in open spaces, while over-segmentation fragments a single room into overlapping segments in complex environments. These issues stem from naive voxel-based map representations that rely solely on geometric proximity, disregarding the structural constraints of traversable spaces and resulting in inconsistent room layers within scene graphs. To the best of our knowledge, this work is the first to tackle segmentation inconsistency as a challenge and address it with Traversability-Aware Consistent Scene Graphs (TACS-Graphs), a novel framework that integrates ground robot traversability with room segmentation. By leveraging traversability as a key factor in defining room boundaries, the proposed method achieves a more semantically meaningful and topologically coherent segmentation, effectively mitigating the inaccuracies of voxel-based scene graph approaches in complex environments. Furthermore, the enhanced segmentation consistency improves loop closure detection efficiency in the proposed Consistent Scene Graph-leveraging Loop Closure Detection (CoSG-LCD), leading to higher pose estimation accuracy. Experimental results confirm that the proposed approach outperforms state-of-the-art methods in terms of scene graph consistency and pose graph optimization performance.
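
One way to picture how consistent room labels can cheapen loop closure detection is to gate candidate keyframes by room identity before any expensive matching. The sketch below is only illustrative; the keyframe fields and descriptor distance are assumptions, and the paper's CoSG-LCD details are not reproduced here.

import numpy as np

def loop_candidates(keyframes, new_kf, max_desc_dist=0.3):
    """keyframes: list of dicts with 'room_id', 'descriptor'."""
    candidates = []
    for kf in keyframes:
        # A consistent room layer keeps the same id for the same physical room
        # across revisits, so a cheap equality test prunes the search space.
        if kf["room_id"] != new_kf["room_id"]:
            continue
        d = np.linalg.norm(kf["descriptor"] - new_kf["descriptor"])
        if d < max_desc_dist:
            candidates.append((kf, d))
    return sorted(candidates, key=lambda c: c[1])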


Point2Graph: An End-to-end Point Cloud-based 3D Open-Vocabulary Scene Graph for Robot Navigation

Xu, Yifan, Luo, Ziming, Wang, Qianwei, Kamat, Vineet, Menassa, Carol

arXiv.org Artificial Intelligence

Current open-vocabulary scene graph generation algorithms rely heavily on both 3D scene point cloud data and posed RGB-D images, and thus have limited applicability in scenarios where RGB-D images or camera poses are not readily available. To solve this problem, we propose Point2Graph, a novel end-to-end point cloud-based 3D open-vocabulary scene graph generation framework in which the requirement of posed RGB-D image series is eliminated. This hierarchical framework contains room and object detection/segmentation and open-vocabulary classification. For the room layer, we merge a geometry-based border detection algorithm with learning-based region detection to segment rooms, and create a "Snap-Lookup" framework for open-vocabulary room classification. In addition, we create an end-to-end pipeline for the object layer to detect and classify 3D objects based solely on 3D point cloud data. Our evaluation results show that our framework outperforms the current state-of-the-art (SOTA) open-vocabulary object and room segmentation and classification algorithms on widely used real-scene datasets.
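
The lookup step common to open-vocabulary classification can be sketched as picking the label whose text embedding is closest to a region's visual embedding. How Point2Graph obtains that region embedding (its "Snap-Lookup" rendering of the room point cloud) is not reproduced here; embed_text and region_embedding stand in for a CLIP-style encoder and its output.

import numpy as np

def classify_open_vocab(region_embedding, label_texts, embed_text):
    # Embed each candidate label and normalize everything to unit length.
    label_embs = np.stack([embed_text(t) for t in label_texts])
    region = region_embedding / np.linalg.norm(region_embedding)
    labels = label_embs / np.linalg.norm(label_embs, axis=1, keepdims=True)
    # Cosine similarity between the region and every candidate label.
    scores = labels @ region
    best = int(np.argmax(scores))
    return label_texts[best], float(scores[best])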


QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding

Mehan, Yash, Gupta, Kumaraditya, Jayanti, Rohit, Govil, Anirudh, Garg, Sourav, Krishna, Madhava

arXiv.org Artificial Intelligence

Understanding the structural organisation of 3D indoor scenes in terms of rooms is often accomplished via floorplan extraction. Robotic tasks such as planning and navigation require a semantic understanding of the scene as well. This is typically achieved via object-level semantic segmentation. However, such methods struggle to segment out topological regions like "kitchen" in the scene. In this work, we introduce a two-step pipeline. First, we extract a topological map, i.e., the floorplan of the indoor scene, using a novel multi-channel occupancy representation. Then, we generate CLIP-aligned features and semantic labels for every room instance based on the objects it contains, using a self-attention transformer. Our language-topology alignment supports natural language querying, e.g., a "place to cook" locates the "kitchen". We outperform the current state-of-the-art on room segmentation by ~20% and room classification by ~12%. Our detailed qualitative analysis and ablation studies provide insights into the problem of joint structural and semantic 3D scene understanding.
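
A rough sketch of language-queryable room features: pool the CLIP embeddings of the objects inside a room and answer a free-form query by similarity. QueSTMaps uses a self-attention transformer for this aggregation; the mean pooling and the embed_text placeholder below are simplifications for illustration.

import numpy as np

def room_feature(object_embeddings):
    # Pool per-object CLIP features into one room descriptor.
    pooled = np.mean(object_embeddings, axis=0)
    return pooled / np.linalg.norm(pooled)

def query_rooms(query, rooms, embed_text):
    """rooms: dict mapping room_id -> pooled feature; returns best match."""
    q = embed_text(query)
    q = q / np.linalg.norm(q)
    scores = {rid: float(feat @ q) for rid, feat in rooms.items()}
    return max(scores, key=scores.get), scores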


Semantic Layering in Room Segmentation via LLMs

Kim, Taehyeon, Min, Byung-Cheol

arXiv.org Artificial Intelligence

In this paper, we introduce Semantic Layering in Room Segmentation via LLMs (SeLRoS), an advanced method for semantic room segmentation by integrating Large Language Models (LLMs) with traditional 2D map-based segmentation. Unlike previous approaches that solely focus on the geometric segmentation of indoor environments, our work enriches segmented maps with semantic data, including object identification and spatial relationships, to enhance robotic navigation. By leveraging LLMs, we provide a novel framework that interprets and organizes complex information about each segmented area, thereby improving the accuracy and contextual relevance of room segmentation. Furthermore, SeLRoS overcomes the limitations of existing algorithms by using a semantic evaluation method to accurately distinguish true room divisions from those erroneously generated by furniture and segmentation inaccuracies. The effectiveness of SeLRoS is verified through its application across 30 different 3D environments. Source code and experiment videos for this work are available at: https://sites.google.com/view/selros.
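
As an illustration of the kind of semantic check SeLRoS describes, a language model can be asked whether a candidate split is a real room boundary or an artifact of furniture, given the objects in each segment. The llm_complete callable and the prompt wording below are placeholders, not the paper's actual prompts or API.

def is_true_room_division(objects_a, objects_b, llm_complete):
    # Describe both segments by their object lists and ask for a verdict.
    prompt = (
        "Two adjacent map segments contain these objects.\n"
        f"Segment A: {', '.join(objects_a)}\n"
        f"Segment B: {', '.join(objects_b)}\n"
        "Do these segments belong to two different functional rooms, or is the "
        "split likely caused by furniture blocking the map? Answer MERGE or KEEP."
    )
    answer = llm_complete(prompt).strip().upper()
    return answer.startswith("KEEP")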


Real-Time Line-Based Room Segmentation and Continuous Euclidean Distance Fields

Warberg, Erik, Miksits, Adam, Barbosa, Fernando S.

arXiv.org Artificial Intelligence

Continuous map representations, as opposed to traditional discrete ones such as grid maps, have been gaining traction in the research community. However, current approaches still suffer from high computation costs, preventing their use in large environments without sacrificing precision. In this paper, a scalable method building upon Gaussian Process-based Euclidean Distance Fields (GP-EDFs) is proposed. By leveraging structure inherent to indoor environments, namely walls and rooms, we achieve an accurate continuous map representation that is fast enough to be updated and used in real time. This is made possible by a novel line-based room segmentation algorithm, which enables the creation of smaller local GP-EDFs for each room; these in turn also use line segments as their shape priors, thus representing the map more efficiently with fewer data points. We evaluate this method in simulation experiments and make the code available open-source.
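
A minimal sketch of the line-based idea: with walls stored as line segments, the Euclidean distance at any query point can be evaluated directly from the segments rather than from a dense grid. The GP-EDF residual modelling from the paper is not included; this is only the geometric prior.

import numpy as np

def point_to_segment(p, a, b):
    """Distance from 2D point p to segment ab (numpy arrays)."""
    ab = b - a
    denom = float(np.dot(ab, ab))
    if denom == 0.0:  # degenerate segment: endpoints coincide
        return float(np.linalg.norm(p - a))
    t = np.clip(np.dot(p - a, ab) / denom, 0.0, 1.0)
    return float(np.linalg.norm(p - (a + t * ab)))

def distance_field(p, segments):
    """segments: list of (a, b) wall endpoints; returns nearest-wall distance."""
    return min(point_to_segment(p, a, b) for a, b in segments)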


S-Graphs+: Real-time Localization and Mapping leveraging Hierarchical Representations

Bavle, Hriday, Sanchez-Lopez, Jose Luis, Shaheer, Muhammad, Civera, Javier, Voos, Holger

arXiv.org Artificial Intelligence

In this paper, we present an evolved version of Situational Graphs, which jointly models in a single optimizable factor graph (1) a pose graph, as a set of robot keyframes comprising associated measurements and robot poses, and (2) a 3D scene graph, as a high-level representation of the environment that encodes its different geometric elements with semantic attributes and the relational information between them. Specifically, our S-Graphs+ is a novel four-layered factor graph that includes: (1) a keyframes layer with robot pose estimates, (2) a walls layer representing wall surfaces, (3) a rooms layer encompassing sets of wall planes, and (4) a floors layer gathering the rooms within a given floor level. The above graph is optimized in real time to obtain a robust and accurate estimate of the robot's pose and its map, simultaneously constructing and leveraging high-level information of the environment. To extract this high-level information, we present novel room and floor segmentation algorithms utilizing the mapped wall planes and free-space clusters. We tested S-Graphs+ on multiple datasets, including simulated and real data of indoor environments from varying construction sites, and on a real public dataset of several indoor office areas. On average over our datasets, S-Graphs+ outperforms the second-best method in accuracy by a margin of 10.67%, while extending the robot's situational awareness with a richer scene model. Moreover, we make the software available as a Docker file.
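
A schematic sketch of the four-layered representation as plain data is given below; the actual S-Graphs+ system optimizes these entities jointly as one factor graph, which is not reproduced here, and the field names are illustrative only.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Keyframe:          # layer 1: robot pose estimates with measurements
    id: int
    pose: List[float]    # e.g. [x, y, z, qx, qy, qz, qw]

@dataclass
class Wall:              # layer 2: planar wall surfaces
    id: int
    plane: List[float]   # plane coefficients [nx, ny, nz, d]
    observed_from: List[int] = field(default_factory=list)  # keyframe ids

@dataclass
class Room:              # layer 3: a set of wall planes forming one room
    id: int
    wall_ids: List[int] = field(default_factory=list)

@dataclass
class Floor:             # layer 4: rooms grouped by floor level
    level: int
    room_ids: List[int] = field(default_factory=list)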